Crohn's disease and ulcerative colitis, the two main types of
chronic inflammatory bowel disease, are multifactorial conditions
of unknown aetiology. A susceptibility locus for Crohn's
disease has been mapped to chromosome 16. Here we have used a
positional-cloning strategy, based on linkage analysis followed by
linkage disequilibrium mapping, to identify three independent
associations for Crohn's disease: a frameshift variant and two
missense variants of NOD2, encoding a member of the Apaf-1/
Ced-4 superfamily of apoptosis regulators that is expressed in
monocytes. These NOD2 variants alter the structure of either the
leucine-rich repeat domain of the protein or the adjacent region.
NOD2 activates nuclear factor NF-κB; this activating function is
regulated by the carboxy-terminal leucine-rich repeat domain,
which has an inhibitory role and also acts as an intracellular
receptor for components of microbial pathogens. These observations
suggest that the NOD2 gene product confers susceptibility to
Crohn's disease by altering the recognition of these components
and/or by over-activating NF-κB in monocytes, thus documenting
a molecular model for the pathogenic mechanism of Crohn's
disease that can now be further investigated.

Crohn's disease (CD; MIM 266600) occurs primarily in young
adults, with an estimated prevalence of 1 in 1,000 in western
countries. Its incidence has increased markedly in the past half
century, suggesting a contribution of environmental factors.
Familial aggregation of the disease suggests that
genetic factors may also be involved, a hypothesis that was
substantiated in 1996 by the discovery of a susceptibility locus for
CD, IBD1, on chromosome 16 (ref. 3). Identification of the exact
nature of the genetic changes that are implicated in CD
susceptibility would provide a specific approach to understanding this
common disorder.

Because candidate genes previously localized on chromosome 16
failed to show an association with CD (ref. 5), we refined the localization
of the IBD1 susceptibility locus by typing 26 microsatellite markers
spaced at an average distance of 1 cM in the pericentromeric region
Figure 1 Strategy used to identify the IBD1 locus. a, Profile for the multipoint
non-parametric linkage (NPL) scores. Approximate cytogenetic localizations are shown
for selected microsatellite markers used in the first study that located IBD1 to the
pericentromeric region of chromosome 16. Subsequent linkage analyses were focused on
the region between SPN and D16S408. The highest NPL score (maximum NPL score,
3.49; P = 2.37 × 10^-4) is in the region between markers D16S541 and D16S2623
(ref. 6). b, Physical map of the IBD1 region. White and black boxes correspond to the two
BAC contigs and BAC clone hb87b10, respectively. The distance between D16S541 and
D16S2623 is ~2 Mb. c, Putatively transcribed regions in the sequenced region (white and
black boxes). Bold arrow indicates direction of transcription. The positions of the SNPs
typed in 235 families for linkage disequilibrium studies are shown.
contigs spanning this region (Fig. 1b), which supported linkage
disequilibrium mapping. The transmission disequilibrium test,
performed on a single trio from each of 108 (77 multiplex and
31 simplex) families, showed a borderline significant association
(P < 0.05) between the disease phenotype and the 207-base-pair
(bp) allele of D16S3136. This observation was replicated with
another set of 76 families, although with a different allele (the
205-bp allele; P < 0.01). These two observations may be due to
type-one errors. Alternatively, they may reflect true association in two
sets of families drawn from genetically different populations.
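
The transmission disequilibrium test mentioned here can be sketched as a McNemar-type statistic computed from heterozygous parents only. The snippet below is a minimal illustration, not the software used in the study; the function name `tdt_statistic` and the counts passed to it are hypothetical. For a multi-allelic marker such as a microsatellite, it assumes one allele is tested against all others pooled.

```python
import math


def tdt_statistic(transmitted, untransmitted):
    """McNemar-type TDT for one allele of a marker.

    transmitted   -- times heterozygous parents passed the allele to an affected child
    untransmitted -- times heterozygous parents did not pass it on
    Returns (chi-square with 1 degree of freedom, P value).
    """
    informative = transmitted + untransmitted
    if informative == 0:
        raise ValueError("no informative transmissions from heterozygous parents")
    chi2 = (transmitted - untransmitted) ** 2 / informative
    # Upper tail of a 1-df chi-square distribution: P = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts, for illustration only (not data from this study).
chi2, p_value = tdt_statistic(transmitted=30, untransmitted=15)
print(f"chi2 = {chi2:.2f}, P = {p_value:.4f}")
```

Under the null hypothesis of no linkage or association, a heterozygous parent transmits either allele with probability 1/2, which is why only the difference between the two counts matters.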

The latter hypothesis led to the following strategy: a 164-kb BAC
clone (hb87b10) from the CEPH-BAC library containing D16S3136
was sequenced completely (EMBL accession number AJ303140). A
public database search extended the sequence of the corresponding
region to 260 kb but did not identify characterized genes, with the
exception of KIAA0349, which codes for a ubiquitin C-terminal
hydrolase homologue in Caenorhabditis elegans. However, analysis
by GRAIL and an expressed sequence tag (EST) homology search
identified many putatively transcribed regions (Fig. 1c).

Eleven single-nucleotide polymorphisms (SNP 1-11) selected
from these regions were genotyped, together with microsatellite
markers D16S3035 and D16S3136, in a total of 235 available CD
families (Table 1). Strong linkage disequilibrium was observed
among most markers (data not shown). Several SNPs showed
significant association with CD by the pedigree disequilibrium
test (PDT; ref. 5), confirming the existence of linkage disequilibrium
with the disease locus over the investigated region (especially SNP 2,
nominal P value 0.00002; Table 1).

These observations prompted the characterization of neighbouring
Unigene clusters (Fig. 1c). Eleven overlapping clones, isolated



gene'and'sliov^ fi^addirlonal 
cUseBse^re&ttd^^ cams of this gene were sequenced in 50 
^ rmrriii'teft ^ sibling pair 

icWracalbycIeisc^ 16 homologous regions. 

Two additional non-synonymous variants (SNP 12 and 13), with rare-allele
frequencies greater than 0.03, were identified and
subsequently used to type the 235 CD families.

The PDT was most significant for SNP 13 (P = 6 × 10^-6). Families
were divided into two groups: those with at least one member
carrying the rare allele of SNP 13 and those without this allele. The
latter group of families failed to show association between CD and
SNP 4-6, and showed a considerable decrease in the significance of
the SNP 2 association. This result indicates that the associations of
these four loci with CD were not independent of SNP 13 (Table 1).
In contrast, significance of the CD associations with SNP 8 and 12
decreased modestly in these families, indicating a minimal
contribution of the rare SNP 13 allele to these associations.

The 8 intragenic SNPs that were initially identified defined 41
different haplotypes. Three of these revealed preferential
transmission to affected individuals (Table 2). These three haplotypes each
contain one rare allele of SNP 8, 12 or 13 in a context of a common
background. Notably, the haplotype defined by the same
background and by the absence of these rare alleles did not show such
transmission distortion (Table 2). Furthermore, the rare alleles of
SNP 8, 12 and 13 were never found on the same haplotype,
indicating independent association of CD susceptibility with each
of three non-synonymous variants of the same gene.

Table 1 Linkage disequilibrium analyses at the IBD1 locus

As a result of these associations, the allele frequencies of SNP 8, 12
and 13 differed in the group of CD patients as compared with
controls (Table 3). Average risks for CD, computed for genotypes
containing zero, one or two variants (Table 3), revealed a
gene-dosage effect, with the highest risk for genotypes with two variants.
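
The comparison of risk across genotypes carrying zero, one or two variants can be illustrated with odds ratios taken relative to the zero-variant class. This is a sketch under stated assumptions: the odds ratio is used here as a stand-in and may differ from the risk estimator behind Table 3, the function name `odds_ratios_by_variant_count` is hypothetical, and the counts are invented for illustration.

```python
def odds_ratios_by_variant_count(cases, controls):
    """Odds ratio of each genotype class relative to the zero-variant class.

    cases, controls -- dicts mapping the number of variant alleles (0, 1, 2)
                       to the number of individuals with that genotype class.
    """
    if cases[0] == 0 or any(controls[n] == 0 for n in cases):
        raise ValueError("odds ratio undefined when a required cell is empty")
    ratios = {}
    for n in sorted(cases):
        # (cases_n / controls_n) divided by (cases_0 / controls_0)
        ratios[n] = (cases[n] * controls[0]) / (controls[n] * cases[0])
    return ratios


# Hypothetical genotype counts, for illustration only (not data from Table 3).
example_cases = {0: 50, 1: 40, 2: 10}
example_controls = {0: 100, 1: 20, 2: 1}
ratios = odds_ratios_by_variant_count(example_cases, example_controls)
print(ratios)
```

A gene-dosage effect shows up as ratios that rise with the number of variants, as they do in this invented example.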

It n^ ^so conixiWe toi.ccplam- tl^aiiisi^;pitcte!Mx ; of the 
affected dWmg^pair analysis in ma^u^ t^'soso^^ 

The rare allele of SNP 8 was associated positively with the 205-bp
allele of D16S3136 and negatively with the 207-bp allele. The inverse
association was noted for the rare alleles of SNP 12 and 13, thus
providing a rationale for the initial observations made with this
microsatellite marker (data not shown). Genotype frequencies were
comparable in CD patients originating from uniquely and multiply
affected kindred, an observation compatible with the close clinical
similarity of the sporadic and familial diseases 1 *. The observed 
linkage of CD to chromosome 16 could not be entirely explained 
by the present associations, because GeneHunter analysis of $5 
multiplex families without SNP 8, 12 and 13 revealed a component 
of linkage (nonparametric lod score (NPL) 1.6, pointwise
significance P < 0.02). Thus, other variants of this gene or additional
genes on chromosome 16 may be involved in CD susceptibility.
Genotyping of 167 patients with ulcerative colitis revealed
genotype frequencies comparable to those of controls, indicating that
these SNPs were not associated with susceptibility to ulcerative
colitis — an observation in agreement with its lack of linkage to the 
IBD1 locus 1 *. 

The candidate IBD1 gene has high expression in leukocytes, but
low or no expression in the other investigated tissues, including



memBerZof^ ;:*. 



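From its amino terminus to its carboxy terminus, the protein contains two
caspase-recruitment domains (CARD), a nucleotide-binding
domain (NBD) and a leucine-rich repeat (LRR) region. The LRR domain of
NOD2 has binding activity for bacterial lipopolysaccharides (LPS)
and its deletion stimulates NF-κB activation.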

The rare allele of SNP 13 corresponds to a 1-bp insertion in exon
10 (980fs) predicted to truncate NOD2 in the LRR region. Those of
SNP 12 and 8 cause non-conservative substitutions in the LRR
domain (G881R) and in the proximal adjacent region (R675W),
respectively (Fig. 2a). Systematic sequencing of the coding sequence
of NOD2 revealed additional very rare missense variants, which
together were observed in 5% of controls and 4% of patients with
ulcerative colitis, compared with 17% for CD patients,
where the most frequent variants tended to cluster in the LRR and
its adjacent regions (Fig. 2b). This excess suggests that, in addition
to SNP 8, 12 and 13, more variants in this part of the NOD2 protein
may be associated with CD susceptibility. Thus, the LRR domain of
CD-associated variants is likely to be impaired, possibly to various
degrees, in its recognition of microbial components and/or in the
physiological inhibition of NOD2 dimerization, thus resulting in
the inappropriate activation of NF-κB in monocytes.
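
How a 1-bp insertion can truncate a protein is easiest to see on a toy example. The sketch below uses an invented 15-nucleotide sequence, not the NOD2 sequence, and a deliberately partial codon table; the names `CODON_TABLE`, `translate`, `wild_type` and `mutant` are illustrative assumptions. The toy was chosen so that the insertion turns a leucine codon into a proline codon directly followed by a stop codon, mirroring the kind of change described for the frameshift variant.

```python
# Subset of the standard genetic code, sufficient for the toy sequences below.
CODON_TABLE = {
    "ATG": "M", "CTT": "L", "GAA": "E", "AGC": "S",
    "CCT": "P", "TAA": "*", "TGA": "*",
}


def translate(dna):
    """Translate codon by codon from position 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Invented wild-type coding sequence: ATG CTT GAA AGC TAA
wild_type = "ATGCTTGAAAGCTAA"
# Insert a single cytosine inside the second codon, shifting the reading frame:
# ATG CCT TGA ...  (Leu -> Pro, then a premature stop)
mutant = wild_type[:4] + "C" + wild_type[4:]

print(translate(wild_type))
print(translate(mutant))
```

Every codon downstream of the insertion is read in a shifted frame, so the product is both altered and, once a stop codon appears in the new frame, shortened.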

Much evidence supports bacteria-induced NF-κB dysregulation
in CD. First, susceptibility to spontaneous inflammatory bowel 



Table 2 Transmission disequilibrium of NOD2 haplotypes in CD families
Figure 2 Representation of the IBD1/NOD2 protein variants. The translation product
deduced from the cDNA sequence of the candidate IBD1 gene is identical to that of
NOD2 (ref. 14). The polypeptide contains two caspase recruitment domains (CARD), a
nucleotide-binding domain (NBD) and ten 27-amino-acid, leucine-rich repeats (LRRs).
The consensus sequence of the ATP/GTP-binding site motif A (P-loop) of the NBD is
indicated in black. The sequence changes encoded by the three main variants associated
with CD are SNP 8 (R675W), SNP 12 (G881R) and SNP 13 (980 frameshift). This
frameshift changes a leucine to a proline at position 980, and is immediately followed by a


disease (IBD) in mice has been associated with mutations in
Toll-like receptor 4 (TLR4), a member of a family of NF-κB activators
that is known to bind LPS through its LRR domain (refs 19, 20). Second,
antibiotic therapy causes transient improvement of CD patients,
supporting the hypothesis that enteric bacteria may have an
aetiological role in CD (ref. 21). Third, NF-κB has a pivotal role in IBD
and is activated in mononuclear cells of the intestinal lamina propria
in CD (ref. 22). Last, CD treatment is based on the use of sulphasalazine
and glucocorticoids, two known NF-κB inhibitors (refs 23, 24).
Frequencies of the three rare variant alleles
stop codon. SNP 5 is described in Table 1. The allele frequencies of the V928I
polymorphism were not significantly different (0.92:0.08) in the three groups, and the
corresponding genotypes were in Hardy-Weinberg equilibrium. The positions of the rarer
missense variants, observed in 457 CD patients, 159 ulcerative colitis patients and 103
unaffected unrelated individuals, are indicated for these groups. Left scale indicates the
number of each identified variant in the investigated groups; right scale measures the
mutation frequency.



Genetic susceptibility to CD is not limited to chromosome 16 and
at least five additional loci have been implicated (refs 25-30). The
recognition of a transduction pathway that, when deregulated, contributes
to the pathogenesis of CD will accelerate the discovery of additional
susceptibility genes. It will also contribute to the identification of
associated environmental factors and focus the search for specific
therapies.



Families, microsatellite markers and contig construction

A total of 235 CD families (117 simplex nuclear families, 96 multiplex nuclear families
and 22 extended pedigrees, corresponding to a total of 179 CD patients and 261
unaffected relatives) were recruited according to published diagnostic
criteria. In addition, 100 multiplex and 59 simplex ulcerative colitis families were
recruited from the same hospitals. Written informed consent was obtained from all
participants. All relatives from 77 multiplex families were typed for 26 mapped
microsatellite markers with an average resolution of 1 cM between SPN and D16S408.
We constructed contigs using seven previously localized sequence tag sites (STSs;
D16S541, D16S3035, D16S3136, D16S3117, D16S770, D16S416, D16S2623) and
subsequently additional ones (wi-928S, m-16505, 4%>I7274,
**SG-30C35, D15S766) and 79 new STSs derived from the end sequences of the
isolated BAC clones.

Clones, sequencing and SNPs

TheDNAofBAC done hbflTblO coittiifiing Dlffl313fi w» A ug mented bj t oaiaXion sad 
tobdooed b bactetiopliage MiA. Wt xwed ^ueiicafiOTbochmwdiof7lI$sobclone3 
iadfrnmdiraapriart>rdkiiigtore«mstruatte 

{http-i^wwphnpiaig). Ideo% sesdt in DNA dsttbuo testified two ovcxlappbg 
Kspenced fl/& (ACn0733^ Gaifl^ 

iwltn i l c J on the extended stqaeact with BLAST ?L4 in GenBmk icie»e Hi. identified 
10 DniBene'cWeiK. the fcEowing ESI cbaei ^ou i ym dlaa to tome of Hunt chiTtcn • 
were obtained fttwt Ae American Type CblhroCoflcrticn (htrpi/^^ 
tequsettd mnaplctelr to identify tddfttool tnuB9Crib«d regtav AI12S2I7, AA417810, 

AA02 ma, AIOW427UA910320, AA751M9. Ootw A1090427 «cd AA910520» 
a n i i-H WH ui i n g to h»13520l, we wed to acreen a hlnod lectocTtg cDKAUhrtry (na. 
93 tiaff?; ^rmtgenc), tnd fttric^cd 11 doaiet of tbe ZSDJ andiAig gac,- ; 

Atotiiof CTSTSxinciliyjciw^&omyuUUHy tranicrfbrtl tppencei {2ST 
loxzmlofiCf«]^(3L^r2jO predict^ 
• ~TK^ttK*d frB'fo^g'^i^twff fbuirtbe OKA of tpi CD patient* «jiLtvw» nnaS£3ai£r

jM«d* tSNP-l-fr, CkoBank aaaaadon muab^G^^/^GSTKSL ttticborftted to 
<fi£N$ c^Naiiani] Carter fer fiurtedjodfigf IflSnaiii^'ind- typed 6a the sun 

groups of 457 CD patients, 159 ulcerative colitis patients and 103 unaffected unrelated
individuals. All variant alleles were confirmed by sequencing a second independent
amplification product.

Genotypic data were analysed for linkage using the NPL score of GeneHunter. Data
from linkage disequilibrium mapping of CD were analysed initially with the transmission
disequilibrium test using a single trio (one affected and both parents) per family.
Subsequently, the pedigree disequilibrium test was performed using the PDT 2.11
program to analyse data from all family relatives. We estimated allele frequencies for 3
groups: 418 unrelated CD patients, 159 ulcerative colitis patients and 103 controls
(including 78 unaffected, unrelated spouses of CD patients and 25 unrelated CEPH family
members).
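
Estimating allele frequencies from genotype counts, and checking whether the genotypes are in Hardy-Weinberg equilibrium, can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: it applies a 1-degree-of-freedom chi-square goodness-of-fit test to a biallelic marker, the function name `allele_frequency_and_hwe` is hypothetical, and the genotype counts are invented.

```python
import math


def allele_frequency_and_hwe(n_aa, n_ab, n_bb):
    """Allele frequency and Hardy-Weinberg chi-square test for a biallelic marker.

    n_aa, n_ab, n_bb -- counts of the three genotypes (AA, AB, BB).
    Returns (frequency of allele A, chi-square with 1 df, P value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # each individual carries two alleles
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("marker is monomorphic; the test is undefined")
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a 1-df chi-square distribution.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return p, chi2, p_value


# Hypothetical genotype counts, for illustration only (not data from this study).
freq, hwe_chi2, hwe_p = allele_frequency_and_hwe(81, 18, 1)
print(f"allele frequency = {freq:.2f}, chi2 = {hwe_chi2:.3f}, P = {hwe_p:.3f}")
```

With small expected counts for the rare homozygote, an exact test is generally preferred to the chi-square approximation shown here.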

Ifeedved 5 £d*u^. accepted 54 Mwefc 20OL 
Crohn's disease is a chronic inflammatory disorder of the
gastrointestinal tract, which is thought to result from the effect of
environmental factors in a genetically predisposed host. A gene
location in the pericentromeric region of chromosome 16, IBD1,
that contributes to susceptibility to Crohn's disease has been
established through multiple linkage studies (refs 1-6), but the specific
gene(s) has not been identified. NOD2, a gene that encodes a
protein with homology to plant disease resistance gene products, is
located in the peak region of linkage on chromosome 16 (ref. 7).
Here we show, by using the transmission disequilibrium test and
case-control analysis, that a frameshift mutation caused by a
cytosine insertion, 3020insC, which is expected to encode a
truncated NOD2 protein, is associated with Crohn's disease.
Wild-type NOD2 activates nuclear factor NF-κB, making it
responsive to bacterial lipopolysaccharides; however, this
induction was deficient in mutant NOD2. These results implicate NOD2
in susceptibility to Crohn's disease, and suggest a link between an